Signal Processing Issues in Realizing Voice Input to Computers
Author
Abstract
In this paper we discuss some issues in processing speech signals, especially for isolated utterances of the characters of a language. In processing such a speech signal, we have no cues from higher-level linguistic information such as prosody, lexicon, syntax, and semantics. Any representation of the signal in terms of fixed parameters for each short (10-20 msec) segment is unlikely to provide the features that distinguish the sounds of the characters for recognition. Processing the speech signal on the basis of acoustic-phonetic knowledge of the characters of the language will enable us to identify features for discriminating between different sounds. We discuss how these features are related to the parameters of the signal. For illustration, we consider the acoustic-phonetic knowledge of the Indian language Hindi. The discussion in this paper shows the need for new methods of processing signals to realize voice input to a computer.
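As an illustration of the fixed-parameter, short-segment representation that the abstract argues is insufficient on its own, the following is a minimal sketch that cuts a signal into 20 msec frames and computes two parameters per frame. The choice of short-time energy and zero-crossing rate, the function name `frame_parameters`, and the synthetic test tone are all assumptions made for illustration; the abstract does not say which parameters the paper considers.

```python
import math


def frame_parameters(signal, sample_rate, frame_ms=20):
    """Split a signal into fixed-length, non-overlapping frames and return
    a (short-time energy, zero-crossing rate) pair for each frame.

    Hypothetical helper: the two parameters are illustrative choices,
    not the ones used in the paper.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")

    params = []
    # Step through the signal one whole frame at a time; a trailing
    # partial frame is dropped.
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        # Mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Fraction of adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1) if frame_len > 1 else 0.0
        params.append((energy, zcr))
    return params


# Synthetic input (an assumption): 100 msec of a 200 Hz tone at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
features = frame_parameters(tone, sample_rate, frame_ms=20)
```

Every frame here is described by the same two numbers regardless of which sound it contains, which is the kind of uniform representation the abstract contrasts with processing guided by acoustic-phonetic knowledge.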
Similar resources
Designing a New Non-Parallel Training Method for Voice Conversion with Better Performance than Parallel Training
Introduction: Voice mimicking by computers has been one of the most challenging topics in speech processing in recent years. A voice conversion system has two sides. On one side is the source speaker, whose voice is changed to mimic the voice of the target speaker on the other side. Two methods of p...
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech from noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
Incorporating Voice Dialogs in a Multi-user Virtual Environment
The application of 3D virtual environments and voice user interfaces (VUI) on personal computers has received significant attention in recent years. Since speech is the most natural way of communication, incorporating VUI into virtual environments can greatly enhance user interaction and immersiveness. Although there has been much research addressing the issue of integrating VUI and 3D virt...
Scientific bases of human-machine communication by voice.
The scientific bases for human-machine communication by voice are in the fields of psychology, linguistics, acoustics, signal processing, computer science, and integrated circuit technology. The purpose of this paper is to highlight the basic scientific and technological issues in human-machine communication by voice and to point out areas of future research opportunity. The discussion is organ...